21 research outputs found

    Asymptotically optimal priority policies for indexable and non-indexable restless bandits

    We study the asymptotically optimal control of multi-class restless bandits. A restless bandit is a controllable stochastic process whose state evolution depends on whether or not the bandit is made active. Since finding the optimal control is typically intractable, we propose a class of priority policies that are proved to be asymptotically optimal under a global attractor property and a technical condition. We consider both a fixed population of bandits and a dynamic population where bandits can depart and arrive. As an example of a dynamic population of bandits, we analyze a multi-class M/M/S+M queue for which we show asymptotic optimality of an index policy. We combine fluid-scaling techniques with linear programming results to prove that when bandits are indexable, Whittle's index policy is included in our class of priority policies. We thereby generalize a result of Weber and Weiss (1990) about asymptotic optimality of Whittle's index policy to settings with (i) several classes of bandits, (ii) arrivals of new bandits, and (iii) multiple actions. Indexability of the bandits is not required for our results to hold. For non-indexable bandits we describe how to select priority policies from the class of asymptotically optimal policies and present numerical evidence that, outside the asymptotic regime, the performance of our proposed priority policies is nearly optimal.

    Heavy-traffic analysis of a non-preemptive multi-class queue with relative priorities

    We study the steady-state queue-length vector in a multi-class single-server queue with relative priorities. Upon service completion, the probability that the next customer to be served is from class k is controlled by class-dependent weights. Once a customer has started service, it is served without interruption until completion. This is a generalization of the random-order-of-service discipline. We investigate the system in a heavy-traffic regime. We first establish a state-space collapse for the scaled queue-length vector, that is, in the limit the scaled queue-length vector is distributed as the product of an exponentially distributed random variable and a deterministic vector. As a direct consequence, we obtain that the scaled number of customers in the system decreases as classes with smaller mean service requirement obtain relatively larger weights. We then show that the scaled waiting time of a class-k customer is distributed as the product of two exponentially distributed random variables. This allows us to determine the value of the weights that minimize the m-th moment of the scaled waiting time for a customer of arbitrary class. We simulate a system with two different classes of customers in order to numerically verify some of the analytical results.
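The weighted selection rule described in this abstract can be illustrated with a minimal sketch. It assumes, as is common for relative-priority disciplines but not stated in the abstract itself, that the probability of picking class k is proportional to its weight times its current queue length. The function name `next_class_probabilities` and its parameters are hypothetical.

```python
def next_class_probabilities(queue_lengths, weights):
    """Probability that the next customer served comes from each class.

    Assumption (not taken from the abstract): class k is selected with
    probability weights[k] * queue_lengths[k] / sum_j weights[j] * queue_lengths[j].
    """
    if len(queue_lengths) != len(weights):
        raise ValueError("queue_lengths and weights must have the same length")
    weighted = [w * n for w, n in zip(weights, queue_lengths)]
    total = sum(weighted)
    if total == 0:
        # No waiting customers: nothing can be selected.
        return [0.0] * len(weighted)
    return [x / total for x in weighted]
```

With equal weights every waiting customer is equally likely to be chosen, which is the random-order-of-service special case mentioned in the abstract.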

    Asymptotically optimal index policies for an abandonment queue with convex holding cost.

    We investigate a resource allocation problem in a multi-class server with convex holding costs and user impatience under the average cost criterion. In general, the optimal policy has a complex dependency on all the input parameters and state information. Our main contribution is to derive index policies that can serve as heuristics and are shown to give good performance. Our index policy attributes to each class an index, which depends on the number of customers currently present in that class. The index values are obtained by solving a relaxed version of the optimal stochastic control problem and combining results from restless multi-armed bandits and queueing theory. They can be expressed as a function of the steady-state probabilities of a one-dimensional birth-and-death process. For linear holding cost, the index can be calculated in closed form and turns out to be independent of the arrival rates and the number of customers present. In the case of no abandonments and linear holding cost, our index coincides with the $c\mu$-rule, which is known to be optimal in this simple setting. For general convex holding cost we derive properties of the index value in limiting regimes: we consider the behavior of the index (i) as the number of customers in a class grows large, which allows us to derive the asymptotic structure of the index policies, (ii) as the abandonment rate vanishes, which allows us to retrieve an index policy proposed for the multi-class M/M/1 queue with convex holding cost and no abandonments, and (iii) as the arrival rate goes to either 0 or $\infty$, representing light-traffic and heavy-traffic regimes, respectively. We show that Whittle's index policy is asymptotically optimal in both light-traffic and heavy-traffic regimes. To obtain further insights into the index policy, we consider the fluid version of the relaxed problem and derive a closed-form expression for the fluid index. The latter is shown to coincide with the index values for the stochastic model in asymptotic regimes. For arbitrary convex holding cost the fluid index can be seen as the $Gc\mu/\theta$-rule, that is, an extension of the generalized $c\mu$-rule ($Gc\mu$-rule) that includes abandonments. Numerical experiments show that the Whittle index policy and the fluid index policy perform very well for a broad range of parameters.
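As a small illustration of the $c\mu$-rule mentioned in this abstract (the special case with linear holding cost and no abandonments), the sketch below ranks classes by the product of holding cost and service rate. The function name `c_mu_priority_order` and its arguments are hypothetical, and the sketch does not cover the paper's general index for convex costs or abandonments.

```python
def c_mu_priority_order(holding_costs, service_rates):
    """Return class indices ordered from highest to lowest priority.

    Under the classical c-mu rule, class k has index c_k * mu_k, where c_k is
    its linear holding cost per customer per unit time and mu_k is its service
    rate; the server prioritizes the non-empty class with the largest index.
    """
    if len(holding_costs) != len(service_rates):
        raise ValueError("holding_costs and service_rates must have the same length")
    indices = [c * mu for c, mu in zip(holding_costs, service_rates)]
    # sorted() is stable, so ties keep the original class order.
    return sorted(range(len(indices)), key=lambda k: indices[k], reverse=True)
```

For example, with holding costs `[1.0, 3.0, 2.0]` and service rates `[2.0, 1.0, 4.0]` the indices are 2, 3 and 8, so class 2 is served first, then class 1, then class 0.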

    Sojourn time approximations for a discriminatory-processor-sharing queue

    We study a multi-class time-sharing discipline with relative priorities known as Discriminatory Processor Sharing (DPS), which provides a natural framework to model service differentiation in systems. The analysis of DPS is extremely challenging and analytical results are scarce. We develop closed-form approximations for the mean conditional (on the service requirement) and unconditional sojourn times. The main benefits of the approximations lie in their simplicity, the fact that they apply to general service requirements with finite second moments, and that they provide insights into the dependency of the performance on the system parameters. We show that the approximation for the mean conditional and unconditional sojourn time of a customer decreases as its relative priority increases. We also show that the approximation is exact in various scenarios, and that it is uniformly bounded in the second moments of the service requirements. Finally, we numerically illustrate that the approximation for exponential, hyperexponential and Pareto service requirements is accurate across a broad range of parameters.

    Proceedings of the 11th Atelier en Évaluation de Performances

    This document contains the proceedings of the 11th Atelier en Évaluation de Performances, held on 15-17 March 2016 at LAAS-CNRS, Toulouse. The Atelier en Évaluation de Performances is a meeting intended to bring together young researchers (doctoral students and postdocs) in the field of Performance Modeling and Evaluation and to give them a forum to present their work. The discipline is devoted to the study and optimization of stochastic and/or timed dynamic systems arising in computer science, telecommunications, manufacturing and robotics, among others. Informal presentation of work, even work in progress, is encouraged in order to strengthen interactions between young researchers and to prepare submissions of new scientific projects. Survey talks on current research topics, given by established researchers in the field, reinforce the training component of the workshop.

    Asymptotic Optimal Control of Markov-Modulated Restless Bandits

    This paper studies optimal control subject to changing conditions, an area that has recently received a lot of attention as it arises in numerous practical situations. Applications include cloud computing systems, where the arrival rates of new jobs fluctuate over time, and the time-varying capacity encountered in power-aware systems or wireless downlink channels. To study this, we focus on a restless bandit model, which has proved to be a powerful stochastic optimization framework to model scheduling of activities. In particular, it has been extensively applied in the context of optimal control of computing systems. This paper is a first step towards the optimal control of restless bandits subject to changing conditions, the latter being modeled by Markov-modulated environments. We consider the restless bandit problem in an asymptotic regime, which is obtained by letting the population of bandits grow large, and letting the environment change relatively fast. We present sufficient conditions for a policy to be asymptotically optimal and show that a set of priority policies satisfies these. Under an indexability assumption, an averaged version of Whittle's index policy is proved to be inside this set of asymptotically optimal policies. The performance of the averaged Whittle's index policy is numerically evaluated for a multi-class scheduling problem in a wireless downlink subject to changing conditions. While keeping the number of bandits constant, we observe that the averaged Whittle's index policy becomes close to optimal as the speed of the modulated environment increases.

    Efficient scheduling in redundancy systems with general service times

    We characterize the impact of scheduling policies on the mean response time in nested systems with cancel-on-complete redundancy. We consider not only redundancy-oblivious policies, such as FCFS and ROS, but also redundancy-aware policies of the form $\Pi_1$-$\Pi_2$, where $\Pi_1$ discriminates among job classes (e.g., least-redundant-first (LRF), most-redundant-first (MRF)) and $\Pi_2$ discriminates among jobs of the same class. Assuming that jobs have independent and identically distributed (i.i.d.) copies, we prove the following: (i) When jobs have exponential service times, LRF policies outperform any other policy. (ii) When service times are New-Worse-than-Used, MRF-FCFS outperforms LRF-FCFS as the variability of the service time grows infinitely large. (iii) When service times are New-Better-than-Used, LRF-ROS (resp. MRF-ROS) outperforms LRF-FCFS (resp. MRF-FCFS) in a two-server system. Statement (iii) also holds when job sizes follow a general distribution and have identical copies (all the copies of a job have the same size). Moreover, we show via simulation that, for a large class of redundancy systems, redundancy-aware policies can considerably improve the mean response time compared to redundancy-oblivious policies. We also explore the effect of redundancy on the stability region.